14 research outputs found

Szótár és enciklopédia az online világban

Author: Pajzs Júlia
Publication venue: Tinta Könyvkiadó
Publication date: 01/01/2015

Repository of the Academy's Library

Tulajdonnevek felismerése az Európai Média Monitor magyar moduljának fejlesztésében

Author: Pajzs Júlia
Publication date: 01/01/2015

ELTE Digital Institutional Repository (EDIT)

Tulajdonnevek felismerése az Európai Média Monitor magyar moduljának fejlesztésében

Author: Pajzs Júlia
Publication date: 01/01/2015

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

Nyelvtechnológia a lexikográfia szolgálatában

Author: Pajzs Júlia
Publication venue: Károli Gáspár Református Egyetem; L'Harmattan Kiadó
Publication date: 01/01/2014

Repository of the Academy's Library

Egy “nyelvészbarát” szövegfeldolgozó eszköz: a NooJ

Author: Pajzs Júlia
Publication venue: MTA Nyelvtudományi Intézet
Publication date: 01/01/2016

Repository of the Academy's Library

A készülő Akadémiai nagyszótár számítógépes vonatkozásai

Author: Pajzs Júlia
Publication date: 01/01/2003

The project for the Academic Dictionary of Hungarian is presented from a computational point of view. The major steps are the following: collection of the 25-million-running-word Historical Corpus of Hungarian, lemmatization, disambiguation, a user-friendly retrieval interface (www.nytud.hu/hhc), a frequency database of the entries, and on-line compilation of the dictionary entries with the XML module of the Corel Office 2000 WordPerfect 9 program. The TEI-based DTD of the dictionary is also presented.

University of Szeged

Az Európai Médiafigyelő (EMM) magyar változata

Author: Pajzs Júlia
Publication date: 01/01/2014

The European media monitor (http://emm.newsbrief.eu), developed by the European Joint Research Centre, automatically gathers news from several thousand news portals worldwide and sorts it into various categories, around the clock, updating every 10 minutes, using a language technology toolkit. Under a cooperation agreement, the Language Technology Research Group of the MTA Nyelvtudományi Intézet made it possible for the service to operate in Hungarian. The primary task was to recognise Hungarian proper names within the EMM system and to handle their suffixed variants. News of international significance can be accessed in every language version that the system processes.

University of Szeged

Media monitoring and information extraction for the highly inflected agglutinative language Hungarian

Author: Eszter Simon
Júlia Pajzs
Leonida Della Rocca
Maud Ehrmann
Mohamed Ebrahim
Ralf Steinberger
Stefano Bucci
Tamás Váradi
Publication venue: ELRA
Publication date: 01/01/2014

The Europe Media Monitor (EMM) is a fully-automatic system that analyses written online news by gathering articles in over 70 languages and by applying text analysis software for currently 21 languages, without using linguistic tools such as parsers, part-of-speech taggers or morphological analysers. In this paper, we describe the effort of adding to EMM Hungarian text mining tools for news gathering; document categorisation; named entity recognition and classification for persons, organisations and locations; name lemmatisation; quotation recognition; and cross-lingual linking of related news clusters. The major challenge of dealing with the Hungarian language is its high degree of inflection and agglutination. We present several experiments where we apply linguistically light-weight methods to deal with inflection and we propose a method to overcome the challenges. We also present detailed frequency lists of Hungarian person and location name suffixes, as found in real-life news texts. This empirical data can be used to draw further conclusions and to improve existing Named Entity Recognition software. Within EMM, the solutions described here will also be applied to other morphologically complex languages such as those of the Slavic language family. The media monitoring and analysis system EMM is freely accessible online via the web pag

CiteSeerX

Repository of the Academy's Library

Akadémiai nagyszótár = The Historical Dictionary of Hungarian

Author: Gerstner Károly
Ittzés Nóra
Kenesei István
Kiss Margit
Mártonfi Attila
Pajzs Júlia
Pásztor Virág
Révész Katalin
Szentgyörgyi Rudolf
Publication venue: OTKA
Publication date: 01/01/2007

Az elmúlt négy év folyamán, az előzetes terveknek megfelelően elkészült és megjelent a Nagyszótár első két kötete (www.nytud.hu/publ/nszt) 1119 és 1550 lapon. A publikált első kötet tartalmazza a szótár forrásanyagának teljes bibliográfiáját és az egyéb segédleteket: az Elekfi László által készített ragozási szótár kódjainak részletes feloldását. A második kötet az A-Azsúroz címszavak közötti szóállományt tartalmazza. A publikált köteteken túlmenően kéziratban legalább első változatban elkészültek már a B és C betűs szócikkek is, ezek szerkesztése, javítása folyamatban van. Az OTKA-keretből foglalkoztatott Varga Éva Katalin a szócikkíráson kívül részt vett a forrásjegyzék tételeinek filológiai ellenőrzésében és javításában. | The first two volumes of the Dictionary of the Hungarian Language have been published (on 1119 and 1550 pages, respectively) according to the objectives outlined in the grant proposal, cf. www.nytud.hu/publ/nszt. The first volume contains the full bibliography and references of the sources of the Dictionary, as well as all other auxiliary materials, including the dictionary of inflections as compiled by László Elekfi. Volume II contains all entries between A and Azsúroz. In addition to the two published volumes, the first versions of the entries beginning with the letters B and C have also been written, and their editing and philological supervision are under way. Katalin Éva Varga, whose employment was financed by the present grant, took part in the preparation of dictionary entries as well as in the supervision and correction of the bibliographical items in the lists of sources.

Repository of the Academy's Library